Text Image Enhancement in Scenery Images for Degraded Character Recognition using DCT

Authors

  • Hiroki Takahashi
  • Masayuki Nakajima
Abstract

This paper proposes a method for enhancing text images, intended for a system that extracts, recognizes and translates multilingual characters in scenery images captured by a digital camera. The proposed method magnifies text images in the frequency domain. It restores high-frequency components with a DCT (Discrete Cosine Transform) based approach that uses an estimated enlarged image, and it reduces the mosquito and block noise caused by JPEG (Joint Photographic Experts Group) compression during enhancement. The enhanced text images are binarized and then recognized by commercial character recognition software. Experiments are performed on printed documents, sign boards and plates captured by a digital camera. In these experiments, text regions are extracted from the images manually, because the goal of this paper is image enhancement rather than text detection. Compared with traditional approaches, the enhanced images yield a higher recognition ratio with the commercial character recognition software.
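
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of magnifying an image in the DCT domain, not the authors' method. It assumes a grayscale image held in a NumPy array and enlarges it by zero-padding the DCT coefficients, which leaves the new high-frequency components empty; the paper instead restores those components from an estimated enlarged image and also suppresses JPEG noise, neither of which is shown here. The function names `dct_matrix`, `dct_magnify` and `binarize`, and the mean-value threshold, are illustrative assumptions.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale
    return m


def dct_magnify(image, factor=2):
    """Enlarge a 2-D grayscale image by zero-padding its DCT coefficients.

    Assumption: plain zero-padding stands in for the paper's restoration
    of high-frequency components, which the abstract does not specify.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    big_h, big_w = h * factor, w * factor

    # Forward 2-D DCT of the small image.
    coeffs = dct_matrix(h) @ image @ dct_matrix(w).T

    # Copy the known low-frequency block; higher frequencies stay zero.
    # Multiplying by `factor` keeps pixel intensities unchanged under the
    # orthonormal transform (sqrt(factor) per dimension).
    padded = np.zeros((big_h, big_w))
    padded[:h, :w] = coeffs * factor

    # Inverse 2-D DCT at the enlarged size.
    return dct_matrix(big_h).T @ padded @ dct_matrix(big_w)


def binarize(image):
    """Global threshold at the mean intensity (an illustrative choice)."""
    image = np.asarray(image, dtype=float)
    return (image > image.mean()).astype(np.uint8)
```

In this sketch a constant image stays constant after magnification, which is a quick sanity check on the coefficient scaling. A practical pipeline would likely replace the zero-filled block with estimated high-frequency coefficients before passing the binarized result to an OCR engine.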


Similar resources

Extraction of Original Text Document from a Set of Degraded Text Documents from the Same Source

Information extraction is the task of extracting structured data from a degraded document. It includes extracting data such as text, images or graphics from sources such as images, video or documents. Text detection and extraction from degraded documents finds application in a wide range of studies. In this paper, an Optical Character Recognition less (OCR-less) method of obtaining an origi...


Optical Character Recognition from Degraded Document Images

Segmentation of text from badly degraded document images is a very challenging task because of the high inter/intra variation between the document background and the foreground text across different types of document images. In this paper, a novel document image binarization technique is used to address these issues in degraded document images by using adaptive image contrast. The adaptive image...


Text Extraction and Character Recognition form Image using Mathematical Morphology and OCR Technique

Images contain various types of useful information that should be extracted whenever required and this information may be in the form of text present in image. Extraction of this information involves detection, localization, extraction, enhancement and recognition of the text from the given image. Mathematical morphology is the foundation of morphological image processing, which consists of a s...


Improving the quality of images synthesized by discrete cosines transform – regression based method using principle component analysis

Purpose: Different views of an individual's image may be required for proper face recognition. Recently, a discrete cosine transform (DCT) based method has been used to synthesize virtual views of an image using only one frontal image. In this work, the performance of two different algorithms was examined to produce virtual views of one frontal image. Materials and Methods: Two new meth...


Optimizing OCR accuracy for bi-tonal, noisy scans of degraded Arabic documents

Acquiring foreign language from degraded hardcopy documents is of interest to military and border control applications. Bi-tonal image scans are desirable because file size is small. However, the nature of hardcopy degradations and the scanner or image enhancement software capabilities used directly affect the quality of the captured image and the extent of language acquisition. We applied a co...




Publication date: 2005